
246 research outputs found

Optimal control methods for simulating the perception of causality in young infants

Authors: Barto Andrew; Schlesinger Matthew
Publication date: 01/01/1999

There is a growing debate among developmental theorists concerning the perception of causality in young infants. Some theorists advocate a top-down view, e.g., that infants reason about causal events on the basis of intuitive physical principles. Others argue instead for a bottom-up view of infant causal knowledge, in which causal perception emerges from a simple set of associative learning rules. In order to test the limits of the bottom-up view, we propose an optimal control model (OCM) of infant causal perception. OCM is trained to find an optimal pattern of eye movements for maintaining sight of a target object. We first present a series of simulations which illustrate OCM's ability to anticipate the outcome of novel, occluded causal events, and then compare OCM's performance with that of 9-month-old infants. The implications for developmental theory and research are discussed.

CiteSeerX

ScholarWorks@UMass Amherst

CogPrints Cognitive Sciences Eprint Archive

Learning Parameterized Skills

Authors: Barto Andrew; Da Silva Bruno; Konidaris George
Publication date: 01/01/2012

We introduce a method for constructing skills capable of solving tasks drawn from a distribution of parameterized reinforcement learning problems. The method draws example tasks from a distribution of interest and uses the corresponding learned policies to estimate the topology of the lower-dimensional piecewise-smooth manifold on which the skill policies lie. This manifold models how policy parameters change as task parameters vary. The method identifies the number of charts that compose the manifold and then applies non-linear regression in each chart to construct a parameterized skill by predicting policy parameters from task parameters. We evaluate our method on an underactuated simulated robotic arm tasked with learning to accurately throw darts at a parameterized target location. Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012).

arXiv.org e-Print Archive

CiteSeerX

Adaptive Critics and the Basal Ganglia

Author: Barto Andrew G.
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/1995

One of the most active areas of research in artificial intelligence is the study of learning methods by which “embedded agents” can improve performance while acting in complex dynamic environments. An agent, or decision maker, is embedded in an environment when it receives information from, and acts on, that environment in an ongoing closed-loop interaction. An embedded agent has to make decisions under time pressure and uncertainty and has to learn without the help of an ever-present knowledgeable teacher. Although the novelty of this emphasis may be inconspicuous to a biologist, animals being the prototypical embedded agents, this emphasis is a significant departure from the more traditional focus in artificial intelligence on reasoning within circumscribed domains removed from the flow of real-world events. One consequence of the embedded agent view is the increasing interest in the learning paradigm called reinforcement learning (RL). Unlike the more widely studied supervised learning systems, which learn from a set of examples of correct input/output behavior, RL systems adjust their behavior with the goal of maximizing the frequency and/or magnitude of the reinforcing events they encounter over time. While the core ideas of modern RL come from theories of animal classical and instrumental ...

CiteSeerX

ScholarWorks@UMass Amherst

Using Relative Novelty to Identify Useful Temporal Abstractions in Reinforcement Learning

Authors: Barto Andrew G.; Şimşek Özgür
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2004

We present a new method for automatically creating useful temporal abstractions in reinforcement learning. We argue that states that allow the agent to transition to a different region of the state space are useful subgoals, and propose a method for identifying them using the concept of relative novelty. When such a state is identified, a temporally extended activity (e.g., an option) is generated that takes the agent efficiently to this state. We illustrate the utility of the method in a number of tasks.
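The relative-novelty idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything concrete here is an assumption made for illustration: novelty is taken as 1/sqrt(visit count so far), and a state's relative novelty is the summed novelty of the `lag` states that follow it divided by that of the `lag` states that precede it. The function name `relative_novelty_scores` and the toy two-room trajectory are hypothetical.

```python
from collections import Counter
from math import sqrt


def relative_novelty_scores(trajectory, lag=2):
    """Score each interior step of a trajectory by relative novelty.

    Assumed definitions (not taken from the abstract):
      novelty at step t = 1 / sqrt(number of visits to that state so far)
      relative novelty  = novelty of the `lag` following steps
                          / novelty of the `lag` preceding steps
    """
    counts = Counter()
    novelty = []
    for state in trajectory:
        counts[state] += 1
        novelty.append(1.0 / sqrt(counts[state]))

    scores = []
    for t in range(lag, len(trajectory) - lag):
        before = sum(novelty[t - lag:t])
        after = sum(novelty[t + 1:t + 1 + lag])
        scores.append((trajectory[t], after / before))
    return scores


# Toy example: the agent paces inside room A, then crosses a doorway
# into a previously unseen room B.
trajectory = ["a1", "a2"] * 5 + ["door", "b1", "b2"]
scores = relative_novelty_scores(trajectory, lag=2)
best_state, best_score = max(scores, key=lambda pair: pair[1])
```

In this toy run the doorway receives the highest score, because familiar states precede it and unseen states follow it. That matches the abstract's intuition that useful subgoals are states leading into a different region of the state space.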

CiteSeerX

Crossref

ScholarWorks@UMass Amherst

Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density

Authors: Barto Andrew G.; McGovern Amy
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2001

This paper presents a method by which a reinforcement learning agent can automatically discover certain types of subgoals online. By creating useful new subgoals while learning, the agent is able to accelerate learning on the current task and to transfer its expertise to other, related tasks through the reuse of its ability to attain subgoals. The agent discovers subgoals based on commonalities across multiple paths to a solution. We cast the task of finding these commonalities as a multiple-instance learning problem and use the concept of diverse density to find solutions. We illustrate this approach using several gridworld tasks.

CiteSeerX

ScholarWorks@UMass Amherst

Betweenness Centrality as a Basis for Forming Skills

Authors: Barto Andrew G.; Şimşek Özgür
Publication date: 12/04/2007

OPUS

Accelerating Reinforcement Learning through the Discovery of Useful Subgoals

Authors: Barto Andrew G.; McGovern Amy
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2001

An ability to adjust to changing environments and unforeseen circumstances is likely to be an important component of a successful autonomous space robot. This paper shows how to augment reinforcement learning algorithms with a method for automatically discovering certain types of subgoals online. By creating useful new subgoals while learning, the agent is able to accelerate learning on a current task and to transfer its expertise to related tasks through the reuse of its ability to attain subgoals. Subgoals are created based on commonalities across multiple paths to a solution. We cast the task of finding these commonalities as a multiple-instance learning problem and use the concept of diverse density to find solutions. We introduced this approach in [10] and here we present additional results for a simulated mobile robot task.

CiteSeerX

ScholarWorks@UMass Amherst

Novelty or surprise?

Authors: Baldassarre Gianluca; Barto Andrew; Mirolli Marco
Publication venue: Frontiers Media S.A.

Novelty and surprise play significant roles in animal behavior and in attempts to understand the neural mechanisms underlying it. They also play important roles in technology, where detecting observations that are novel or surprising is central to many applications, such as medical diagnosis, text processing, surveillance, and security. Theories of motivation, particularly of intrinsic motivation, place novelty and surprise among the primary factors that arouse interest, motivate exploratory or avoidance behavior, and drive learning. In many of these studies, novelty and surprise are not distinguished from one another: the words are used more-or-less interchangeably. However, while undeniably closely related, novelty and surprise are very different. The purpose of this article is first to highlight the differences between novelty and surprise and to discuss how they are related by presenting an extensive review of mathematical and computational proposals related to them, and then to explore the implications of this for understanding behavioral and neuroscience data. We argue that opportunities for improved understanding of behavior and its neural basis are likely being missed by failing to distinguish between novelty and surprise.
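One way to make the distinction in this abstract concrete is a small sketch in which the two quantities are computed separately. The definitions used are assumptions chosen for illustration and are not taken from the article: novelty is "never observed before" (a memory check), and surprise is Shannon surprisal, -log2 p, under a Laplace-smoothed first-order transition model (a prediction check). The class name `NoveltySurpriseTracker` is hypothetical.

```python
from collections import defaultdict
from math import log2


class NoveltySurpriseTracker:
    """Tracks novelty (memory-based) and surprise (prediction-based)."""

    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.seen = set()
        # transitions[prev][obs] = how often obs followed prev
        self.transitions = defaultdict(lambda: defaultdict(int))
        self.prev = None

    def observe(self, obs):
        # Novelty: has this observation ever been stored in memory?
        is_novel = obs not in self.seen

        # Surprise: how improbable was obs given the previous observation?
        surprisal = None
        if self.prev is not None:
            row = self.transitions[self.prev]
            total = sum(row.values())
            # Laplace smoothing keeps unseen transitions at nonzero probability.
            p = (row[obs] + 1) / (total + len(self.alphabet))
            surprisal = -log2(p)
            row[obs] += 1

        self.seen.add(obs)
        self.prev = obs
        return is_novel, surprisal


tracker = NoveltySurpriseTracker("AB")
first = tracker.observe("A")          # first-ever observation
for symbol in "BABABAB":              # settle into a strict A/B alternation
    tracker.observe(symbol)
repeat = tracker.observe("B")         # familiar symbol, broken pattern
```

Here the final "B" is not novel, since it has been seen several times, yet it carries high surprisal because it violates the learned alternation. A familiar observation can therefore still be surprising.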

PUblication MAnagement